Aim of the analysis

The aim of the analysis is to identify phosphorylation sites that are regulated upon antibody-based TCR activation in primary T-cells. A global analysis of our proteomic/phosphoproteomic data sets should allow us to study phospho-signalling resulting from TCR activation in primary cells.

Sample preparation and input files

Cell stimulation

Romain Roncagalli performed cell isolation and stimulation 4 times (R1: 20141204, R3: 20150306, R4: 20150713, R5: 20151103) as follows:

Day 0:

  1. Purified CD4+ T cells (negative selection, Dynal, Invitrogen cat. 114-15D) are activated in culture with 3 µg/ml of plate-coated anti-CD3 (2C11; coated overnight at 4 °C or for 4 h at 37 °C, then washed 3× with 1× PBS) plus soluble anti-CD28 (1 µg/ml). Cells are distributed in 6-well plates at 4 ml/well of 0.625 × 10^6 cells/ml in complete RPMI.

Day 2:

  1. Harvest cells, wash and count
  2. Resuspend cells at 1 × 10^6 cells/ml with IL-2 (1/2000) and distribute them at 5 ml/well (6-well plates)

Day 3:

  1. Add 5 ml of complete medium with IL-2 per well

Day 4:

  1. Wash cells 2× with RPMI SF at 37 °C, then count
  2. 50 × 10^6 cells are resuspended in 4 ml of RPMI SF and pelleted; 10 µg of biotinylated anti-CD3 and 10 µg of biotinylated anti-CD4 are added to the pellet (for antibody stimulation). Two tubes of 50 × 10^6 cells are prepared for each condition.
  3. Pellets are resuspended in 100 µl of RPMI SF at 37 °C and incubated for 5 min at 37 °C
  4. 28 µg (100 µl) of purified streptavidin (SAV) is added for antibody stimulation, and the cells are incubated at 37 °C for the different time points
  5. Stimulation is stopped by freezing the tube in liquid nitrogen

Sample preparation for MS analysis

Peptide samples

After trypsin digestion, 5 µg of each peptide sample was injected on the Q-Exactive Plus for relative protein quantification.

Phospho-samples

Carine Froment performed the TiO2 enrichments and injected 2.5% of each enriched sample 3 to 4 times (runs labelled “TiO2”). She then performed a phospho-tyrosine IP on the remaining 90%, which was injected in 3 to 4 technical replicates (each corresponding to 22.5% of the starting material). Samples were injected onto a Q-Exactive Plus for label-free relative MS quantification. The analysis was performed using MaxQuant.

Input files

The input files are all in the folder RAW/:

  • PeptideSamples/: protein table output by the MaxQuant analysis of all peptide samples prior to TiO2 and phospho-tyrosine enrichment.
  • TiO2/: phosphosite tables output by the MaxQuant analysis of 10% of the samples after TiO2 enrichment and before phospho-tyrosine enrichment. Each biological replicate was searched independently.
  • pYIP/: phosphosite tables output by the MaxQuant analysis of the samples after TiO2 and phospho-tyrosine enrichment. Each biological replicate was searched independently.

Multisites parsing

MaxQuant returns a table with one row per phosphorylation site. It reports the quantification values measured on multiply phosphorylated peptides, but does not map them to the corresponding pairs (or triplets) of sites. I use a script to identify the quantification values coming from multiply phosphorylated peptides and match them to the sites they belong to.

I keep only the sites with a PEP value <= 0.01.
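As an illustration of this filter (a minimal Python sketch, not the script used for this report, which was run in R), the rows below are made-up stand-ins for a phosphosite table. The `PEP` field name and the choice to drop rows with a missing PEP are assumptions.

```python
# Hypothetical rows standing in for a phosphosite table (values are made up).
sites = [
    {"id": "site_a", "PEP": 0.0004},
    {"id": "site_b", "PEP": 0.01},
    {"id": "site_c", "PEP": 0.03},
    {"id": "site_d", "PEP": None},  # assumption: a missing PEP fails the filter
]

PEP_CUTOFF = 0.01


def passes_pep(row, cutoff=PEP_CUTOFF):
    """Return True when the row has a PEP value at or below the cutoff."""
    pep = row["PEP"]
    return pep is not None and pep <= cutoff


kept = [row["id"] for row in sites if passes_pep(row)]
print(kept)
```

The comparison is inclusive, so a site with a PEP of exactly 0.01 is kept.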

I correct some issues with protein IDs.

For each phosphorylation site ID, I create a list entry holding its ID with gene name (for figures) and the list of all the mono-phosphorylations it corresponds to.
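The lookup described above could look roughly like the following Python sketch. The ID format (`accession_residue`, with multiply phosphorylated combinations joined by `+`), the accessions and the gene names are all hypothetical; the real IDs in this analysis may be built differently.

```python
def describe_site(site_id, gene_names):
    """Build a figure label and the list of mono-phosphorylations for one ID.

    Assumes the hypothetical format 'ACCESSION_RESIDUE', with combinations
    of sites joined by '+'.
    """
    mono_sites = site_id.split("+")
    labelled = []
    for mono in mono_sites:
        accession, residue = mono.split("_", 1)
        # Fall back to the accession when no gene name is known.
        gene = gene_names.get(accession, accession)
        labelled.append(f"{gene}_{residue}")
    return {"label": "+".join(labelled), "mono_sites": mono_sites}


gene_names = {"P00001": "GeneA"}  # made-up accession and gene name
site_ids = ["P00001_S10", "P00001_S10+P00001_T14", "P00002_Y7"]

site_info = {site_id: describe_site(site_id, gene_names) for site_id in site_ids}

# Individual sites covered by all IDs, including the combinations.
individual_sites = {m for info in site_info.values() for m in info["mono_sites"]}
print(len(site_info), len(individual_sites))
```

Keeping the mono-phosphorylations per ID makes it possible to count both the unique IDs (sites or combinations of sites) and the individual sites they cover.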

Data normalisation using iRTs

Synthetic peptides (iRTs) were spiked into the samples to correct for technical variation due to the instrument, and I use their intensities to normalise the data. I keep only the iRTs with a CV < 50% and take as a reference the median of the runs (iRT signal) in the 4th biological repeat. This keeps the values close to those of the NS in R4, where there was an issue with the spike and I cannot perform the normalisation.

## [1] 1625700000

The signal is very low for the iRTs in the unstimulated TiO2_R4 runs. I decided not to normalise these runs, to avoid creating a bias with very low normalisation factors.
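One possible reading of the iRT normalisation is sketched below in Python (the report itself was produced in R, and the exact formula is not shown here). The assumptions are: the CV of each iRT is computed across all runs, the iRT signal of a run is the median intensity of the retained iRTs, the factor is the reference signal divided by the run signal, and runs excluded from normalisation get a factor of 1. Run and peptide names are made up.

```python
from statistics import mean, median, stdev


def cv(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def irt_factors(irt, reference_runs, skip_runs=(), max_cv=0.5):
    """Return one multiplicative factor per run.

    irt maps each iRT peptide to {run: intensity}; every peptide is assumed
    to be quantified in every run.
    """
    kept = [p for p, by_run in irt.items() if cv(list(by_run.values())) < max_cv]
    runs = sorted({run for p in kept for run in irt[p]})
    # Assumption: the iRT signal of a run is the median over the kept iRTs.
    signal = {run: median(irt[p][run] for p in kept) for run in runs}
    reference = median(signal[run] for run in reference_runs)
    return {
        run: 1.0 if run in skip_runs else reference / signal[run]
        for run in runs
    }


irt = {
    "irt1": {"R4_a": 100.0, "R4_b": 120.0, "R1_a": 50.0},
    "irt2": {"R4_a": 200.0, "R4_b": 240.0, "R1_a": 100.0},
    "irt3": {"R4_a": 10.0, "R4_b": 300.0, "R1_a": 5.0},  # CV above 50%, dropped
}

factors = irt_factors(irt, reference_runs=["R4_a", "R4_b"])
print(factors)
```

Under these assumptions a run with a weak iRT signal receives a factor above 1, and passing a run name in `skip_runs` leaves its intensities unchanged.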

Data cleaning

For replicate 1 (R1) of the TiO2 samples, we noticed that injection 1 was not run under optimal conditions. These runs contain more than 50% missing values, so we remove them.

I also remove R1_S120.Inj3bis, which has a lower intensity than the other runs and would impact the statistical analysis, as well as RR_ProFI_S4_R4_CCF01414_EpTyr_NS.Inj3.
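The missing-value criterion can be expressed as a short check. This is a Python sketch with made-up run names and values, not the original R code; missing values are represented as `None`.

```python
def missing_fraction(values):
    """Fraction of entries in a run that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def drop_sparse_runs(table, max_missing=0.5):
    """Keep only the runs whose missing fraction does not exceed max_missing."""
    return {
        run: values
        for run, values in table.items()
        if missing_fraction(values) <= max_missing
    }


# Hypothetical intensities, one list per run.
table = {
    "run_inj1": [None, None, None, 1.0e6],    # 75% missing, removed
    "run_inj2": [2.0e6, 1.5e6, None, 3.0e6],  # 25% missing, kept
}

cleaned = drop_sparse_runs(table)
print(sorted(cleaned))
```

Runs removed for other reasons, such as a globally lower intensity, still have to be listed by name.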


Number of sites

There are 10047 unique sites or combinations of sites (from multiply phosphorylated peptides) in the whole data set. These correspond to 9560 individual phosphorylation sites.

## [1] "Number of unique site in the analysis:"
## [1] 10047
## [1] "Number of unique site in the pYIP:"
## [1] 560

## [1] "Number of unique site in the TiO2:"
## [1] 9702

For figure 1

From now on, I differentiate phosphorylation sites by their localisation score: > 0.75 (Class 1), > 0.5 and <= 0.75 (Class 2), and > 0.25 and <= 0.5 (Class 3).

The multiply-phosphorylated sites do not get a score. For sites identified multiple times, I keep the highest localisation score for the pie chart.
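The class boundaries above translate into a simple cascade of comparisons. The Python sketch below is only an illustration (the report was produced in R). `LowLocScore` is the label used for the remaining sites in this report, while `NoScore` is a hypothetical label for sites without a localisation score.

```python
def loc_class(score):
    """Assign a localisation class from a score between 0 and 1."""
    if score is None:
        return "NoScore"  # hypothetical label, e.g. multiply-phosphorylated sites
    if score > 0.75:
        return "Class 1"
    if score > 0.5:
        return "Class 2"
    if score > 0.25:
        return "Class 3"
    return "LowLocScore"


def best_class(scores):
    """Class of a site seen several times, based on its highest score."""
    observed = [s for s in scores if s is not None]
    return loc_class(max(observed)) if observed else "NoScore"


print(loc_class(0.75), best_class([0.3, 0.9, None]))
```

Because the upper bounds are inclusive, a score of exactly 0.75 falls in Class 2 and a score of exactly 0.5 in Class 3.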

Localisation class   TiO2   pYIP
Class 1              5398    377
Class 2              1652     70
Class 3              1904     67
LowLocScore           280     19

For the monophosphorylations, I keep the Class 1 and Class 2 phosphosites.

After this filter, there are 7810 unique phosphosites.

Number of phosphorylated amino-acids identified in the study.

Total number of pS, pT, pY identified: 7418, 1793, 349. These are present on 2331 proteins.

## [1] "Number of identified phosphorylated amino acid:"
## $TiO2
##       TiO2_R1 TiO2_R3 TiO2_R4 TiO2_R5
## S        4690    4995    5395    5817
## T         716     824     919    1117
## Y          51      60      71     116
## total    5457    5879    6385    7050
## 
## $pYIP
##       pYIP_R1 pYIP_R3 pYIP_R5
## S         133     138     140
## T          37      43      45
## Y         236     262     262
## total     406     443     447
## [1] "Number of identified phosphorylated amino acid that are not reported in the mouse PhosphoSitePlus data:"
## $TiO2
##   TiO2_R1 TiO2_R3 TiO2_R4 TiO2_R5
## S     454     493     570     700
## T     141     171     191     270
## Y      13      15      18      39
## 
## $pYIP
##   pYIP_R1 pYIP_R4 pYIP_R5
## S      13      14      14
## T      14      15      16
## Y      33      38      38

Number of each amino-acid in the table:

I think that the high number of S/T that we find in the pYIP can be explained by the contiguous sites in multiply-phosphorylated peptides.


Data merge and normalisation

For the monophosphorylations, I keep the Class 1 and Class 2 phosphosites before merging the tables. I then log2-transform the data set.
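A log2 transform needs a rule for missing intensities. The Python sketch below (an illustration, not the original R code) assumes that a missing measurement is stored as 0 or `None` and keeps it as missing, so that it cannot turn into a non-finite value.

```python
import math


def log2_or_none(intensity):
    """log2 of a positive intensity; zero or missing values stay missing."""
    if intensity is None or intensity <= 0:
        return None
    return math.log2(intensity)


raw = [1024.0, 0.0, None, 8.0]  # made-up intensities
logged = [log2_or_none(x) for x in raw]
print(logged)
```

Keeping these entries as missing values lets downstream plots and statistics skip them explicitly.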

Pairwise plot with log2 values observed per condition per biological replicate:

QC figures

For the supplementary data of the paper, I make a figure showing the correlation between all runs:

## Warning: Removed 3230 rows containing non-finite values (stat_bin).

## Warning: Removed 5604 rows containing non-finite values (stat_bin).

## R version 3.5.2 (2018-12-20)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.4
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] corrplot_0.84   knitr_1.21      gplots_3.0.1.1  ggplot2_3.1.0  
## [5] reshape2_1.4.3  venneuler_1.1-0 rJava_0.9-10   
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.0         highr_0.7          pillar_1.3.1      
##  [4] compiler_3.5.2     plyr_1.8.4         bindr_0.1.1       
##  [7] bitops_1.0-6       tools_3.5.2        digest_0.6.18     
## [10] evaluate_0.12      tibble_2.0.1       gtable_0.2.0      
## [13] pkgconfig_2.0.2    rlang_0.3.1        yaml_2.2.0        
## [16] xfun_0.4           bindrcpp_0.2.2     withr_2.1.2       
## [19] stringr_1.3.1      dplyr_0.7.8        caTools_1.17.1.1  
## [22] gtools_3.8.1       grid_3.5.2         tidyselect_0.2.5  
## [25] glue_1.3.0         R6_2.3.0           rmarkdown_1.11    
## [28] gdata_2.18.0       purrr_0.2.5        magrittr_1.5      
## [31] scales_1.0.0       htmltools_0.3.6    assertthat_0.2.0  
## [34] colorspace_1.4-0   labeling_0.3       KernSmooth_2.23-15
## [37] stringi_1.2.4      lazyeval_0.2.1     munsell_0.5.0     
## [40] crayon_1.3.4